Deep Web Search Interface Identification: A Semi-Supervised Ensemble Approach
نویسندگان
چکیده
منابع مشابه
Deep Web Search Interface Identification: A Semi-Supervised Ensemble Approach
To surface the Deep Web, one crucial task is to predict whether a given web page has a search interface (searchable HyperText Markup Language (HTML) form) or not. Previous studies have focused on supervised classification with labeled examples. However, labeled data are scarce, hard to get and requires tedious manual work, while unlabeled HTML forms are abundant and easy to obtain. In this rese...
متن کاملTITPI: Web People Search Task Using Semi-Supervised Clustering Approach
Most of the previous works that disambiguate personal names in Web search results employ agglomerative clustering approaches. However, these approaches tend to generate clusters that contain a single element depending on a certain criterion of merging similar clusters. In contrast to such previous works, we have adopted a semisupervised clustering approach to integrate similar documents into a ...
متن کاملSemi-Supervised Ensemble Ranking
Ranking plays a central role in many Web search and information retrieval applications. Ensemble ranking, sometimes called meta-search, aims to improve the retrieval performance by combining the outputs from multiple ranking algorithms. Many ensemble ranking approaches employ supervised learning techniques to learn appropriate weights for combining multiple rankers. The main shortcoming with th...
متن کاملA Semi-Supervised Approach for Gender Identification
In most of the research studies on Author Profiling, large quantities of correctly labeled data are used to train the models. However, this does not reflect the reality in forensic scenarios: in practical linguistic forensic investigations, the resources that are available to profile the author of a text are usually scarce. To pay tribute to this fact, we implemented a Semi-Supervised Learning ...
متن کاملA Semi-supervised Ensemble Approach for Mining Data Streams
There are many challenges in mining data streams, such as infinite length, evolving nature and lack of labeled instances. Accordingly, a semi-supervised ensemble approach for mining data streams is presented in this paper. Data streams are divided into data chunks to deal with the infinite length. An ensemble classification model E is trained with existing labeled data chunks and decision bound...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Information
سال: 2014
ISSN: 2078-2489
DOI: 10.3390/info5040634